Phonetic Question Generation Using Misrecognition

Authors

  • Supphanat Kanokphara
  • Julie Carson-Berndsen
Abstract

Most automatic speech recognition systems are currently based on tied-state triphones. These tied states are usually determined by a decision tree, which automatically clusters triphone states into classes according to the available data so that each class can be trained efficiently. To achieve higher accuracy, this clustering is constrained by manually generated phonetic questions. Moreover, the tree generated from these phonetic questions can be used to synthesize unseen triphones. The quality of a decision tree therefore depends on the quality of its phonetic questions. Unfortunately, creating phonetic questions manually requires considerable time and resources. To overcome this problem, this paper presents an alternative method that generates phonetic questions automatically from misrecognition items. The resulting questions are tested on the standard TIMIT phone recognition task.
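
The abstract does not spell out how misrecognition items are turned into questions, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that phones frequently confused with each other by a recognizer can be grouped into a candidate question set. The function names, the `min_count` threshold, and the grouping by connected components are all illustrative assumptions.

```python
from collections import defaultdict


def confusion_counts(pairs):
    """Count misrecognitions from (reference, hypothesis) phone pairs.

    Correctly recognized phones (reference == hypothesis) are ignored.
    """
    counts = defaultdict(int)
    for ref, hyp in pairs:
        if ref != hyp:
            counts[(ref, hyp)] += 1
    return counts


def questions_from_confusions(pairs, min_count=2):
    """Group mutually confused phones into candidate phonetic questions.

    Assumption (not from the paper): two phones belong to the same
    question if they are confused at least `min_count` times in either
    direction; questions are the connected components of that relation.
    """
    # Merge both confusion directions into one symmetric count.
    symmetric = defaultdict(int)
    for (ref, hyp), n in confusion_counts(pairs).items():
        symmetric[frozenset((ref, hyp))] += n

    # Build an undirected graph of sufficiently frequent confusions.
    neighbours = defaultdict(set)
    for pair, n in symmetric.items():
        if n >= min_count:
            a, b = tuple(pair)
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Each connected component becomes one candidate question.
    seen = set()
    questions = []
    for phone in sorted(neighbours):
        if phone in seen:
            continue
        component = set()
        stack = [phone]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        questions.append(sorted(component))
    return questions
```

Each returned phone set could then serve as a left-context or right-context question during tree-based state tying, for example "is the left neighbour in {m, n}?". A real system would need to decide how to weight rare confusions and whether overlapping or nested sets are wanted; this sketch leaves those choices open.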

Similar resources

Using Semantic and Phonetic Term Similarity for Spoken Document Retrieval and Spoken Query Processing

In classical Information Retrieval systems a relevant document will not be retrieved in response to a query if the document and query representations do not share at least one term. This problem is known as “term mismatch”. A similar problem can be found in spoken document retrieval and spoken query processing, where terms misrecognized by the speech recognition process can hinder the retrieval...

Combination of similarity measures for effective spoken document retrieval

Often users of information retrieval systems and document authors use different terms to refer to the same concept. For this simple reason, information retrieval is affected by the ‘term mismatch’ problem. The term mismatch problem does not only have the effect of hindering the retrieval of relevant documents, it also produces bad rankings of relevant documents. A similar problem can be found i...

Generation of robust phonetic set and decision tree for Mandarin using chi-square testing

A phonetic representation of a language is used to describe the corresponding pronunciation and synthesize the acoustic model of any vocabulary. A phonetic representation with smaller phonetic units such as SAMPA-C for Mandarin Chinese and decision trees for parameter sharing are broadly applied to deal with the problem of large numbers of recognition units. However, the confusable phonetic rep...

Automatic question generation for decision tree based state tying

Decision tree based state tying uses so-called phonetic questions to assign triphone states to reasonable acoustic models. These phonetic questions are in fact phonetic categories such as vowels, plosives or fricatives. The assumption behind this is that context phonemes which belong to the same phonetic class have a similar influence on the pronunciation of a phoneme. For a new phoneme set, wh...

Phonetic corpora and big data

During the last years, 'big data' has emerged as a trendy, highly promising portmanteau term in economics and high-tech domains, such as information technology and speech processing. Big data are often described using a 3V scheme: volume, variety, velocity: a huge volume of data, a large variety of possibly unstructured, heterogeneous data sources, a high frequency or velocity of data generatio...

Publication date: 2006